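We present an algorithm to estimate the path gradient of both the reverse and forward Kullback-Leibler divergence for manifestly invertible normalizing flows. The resulting path-gradient estimators are straightforward to implement and have lower variance than standard total gradient estimators; they lead not only to faster training but also to better overall approximation results. We also demonstrate that path-gradient training is less susceptible to mode collapse. In light of our results, we expect that path-gradient estimators will become the new standard method for training normalizing flows.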
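As a brief illustration of the distinction between the total and the path gradient for the reverse KL divergence, the following derivation uses assumed notation that does not appear in the abstract: a base density $q_0$, an invertible flow $T_\theta$, the flow density $q_\theta$, and a target density $p$. It is a sketch of the standard decomposition, not the algorithm of the paper.

```latex
\begin{align}
\mathrm{KL}(q_\theta \,\|\, p)
  &= \mathbb{E}_{z \sim q_0}\!\left[ \log q_\theta(T_\theta(z)) - \log p(T_\theta(z)) \right], \\
\frac{d}{d\theta}\!\left[ \log q_\theta(T_\theta(z)) - \log p(T_\theta(z)) \right]
  &= \underbrace{\left.\frac{\partial}{\partial x}\!\left[ \log q_\theta(x) - \log p(x) \right]\right|_{x = T_\theta(z)}
     \frac{\partial T_\theta(z)}{\partial \theta}}_{\text{path gradient}}
   + \underbrace{\left.\frac{\partial}{\partial \theta} \log q_\theta(x)\right|_{x = T_\theta(z)}}_{\text{score term}}, \\
\mathbb{E}_{x \sim q_\theta}\!\left[ \frac{\partial}{\partial \theta} \log q_\theta(x) \right]
  &= \frac{\partial}{\partial \theta} \int q_\theta(x)\, dx = 0 .
\end{align}
```

Because the score term vanishes in expectation, dropping it leaves an unbiased estimator; the path-gradient estimator keeps only the first term.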
translated by 谷歌翻译
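Recent work has established a path-gradient estimator for simple Gaussian variational distributions and has argued that the path gradient is particularly beneficial in the regime in which the variational distribution approaches the exact target distribution. In many applications, however, this regime cannot be reached with a simple Gaussian distribution. In this work, we overcome this crucial limitation by proposing a path-gradient estimator for the considerably more expressive variational family of continuous normalizing flows. We outline an efficient algorithm to compute this estimator and establish its superior performance empirically.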
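Estimating the free energy, as well as other thermodynamic observables, is a key task in lattice field theories. Recently, it has been pointed out that deep generative models can be used in this context. Crucially, these models allow the free energy to be estimated directly at a given point in parameter space. This is in contrast to existing methods based on Markov chains, which generally require integrating through parameter space. In this contribution, we review this machine-learning-based estimation method. We discuss the issue of mode collapse in detail and outline mitigation techniques that are particularly suited to applications at finite temperature.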
This study targets the mixed-integer black-box optimization (MI-BBO) problem, in which continuous and integer variables must be optimized simultaneously. The CMA-ES, our focus in this study, is a population-based stochastic search method that samples candidate solutions from a multivariate Gaussian distribution (MGD) and shows excellent performance in continuous BBO. In the CMA-ES, the parameters of the MGD, the mean and (co)variance, are updated based on the evaluation values of the candidate solutions. If the CMA-ES is applied to MI-BBO with straightforward discretization, however, the variance corresponding to the integer variables becomes much smaller than the granularity of the discretization before the optimal solution is reached, which leads to stagnation of the optimization. In particular, when binary variables are included in the problem, this stagnation is more likely to occur because the granularity of the discretization becomes coarser, and the existing modification to the CMA-ES does not address this stagnation. To overcome these limitations, we propose a simple extension of the CMA-ES based on lower-bounding the marginal probabilities associated with the generation of integer variables in the MGD. Numerical experiments on MI-BBO benchmark problems demonstrate the efficiency and robustness of the proposed method. Furthermore, to demonstrate the generality of the idea behind the proposed method, we go beyond the single-objective case, incorporate it into a multi-objective CMA-ES, and verify its performance on bi-objective mixed-integer benchmark problems.
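The stagnation described above can be illustrated with a small one-dimensional calculation. The sketch below is not the proposed method: it assumes integers are obtained by rounding a Gaussian sample to the nearest integer, and the helper names `prob_integer_changes` and `min_sigma_for` are hypothetical. It shows that once the standard deviation is far below the rounding granularity, a sample almost never yields a different integer, and it finds by bisection the smallest standard deviation that keeps that probability above a chosen floor.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_integer_changes(mean, sigma):
    """Probability that rounding a N(mean, sigma^2) sample gives an
    integer different from the rounded mean (assumed discretization)."""
    k = math.floor(mean + 0.5)  # integer cell containing the mean
    stay = (normal_cdf((k + 0.5 - mean) / sigma)
            - normal_cdf((k - 0.5 - mean) / sigma))
    return 1.0 - stay


def min_sigma_for(mean, p_min, tol=1e-9):
    """Smallest sigma (up to tol) with prob_integer_changes >= p_min.
    Illustrative only; requires 0 < p_min < 1."""
    lo, hi = 1e-12, 1.0
    while prob_integer_changes(mean, hi) < p_min:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_integer_changes(mean, mid) >= p_min:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    print(prob_integer_changes(0.0, 0.05))  # effectively zero: stagnation
    print(prob_integer_changes(0.0, 1.0))   # a wide distribution still explores
    print(min_sigma_for(0.0, 0.1))          # sigma needed for a 10% floor
```

The bisection relies on the probability of leaving the cell that contains the mean growing monotonically with sigma, which holds for this rounding scheme.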
Adversarial attacks on thermal infrared imaging expose the risk of related applications. Estimating the security of these systems is essential for safely deploying them in the real world. In many cases, realizing the attacks in the physical space requires elaborate special perturbations. These solutions are often \emph{impractical} and \emph{attention-grabbing}. To address the need for a physically practical and stealthy adversarial attack, we introduce \textsc{HotCold} Block, a novel physical attack on infrared detectors that hides persons using wearable Warming Paste and Cooling Paste. By attaching these readily available temperature-controlled materials to the body, \textsc{HotCold} Block evades human eyes efficiently. Moreover, unlike existing methods that build adversarial patches with complex texture and structure features, \textsc{HotCold} Block utilizes an SSP-oriented adversarial optimization algorithm that enables attacks with pure color blocks and explores the influence of size, shape, and position on attack performance. Extensive experimental results in both digital and physical environments demonstrate the performance of our proposed \textsc{HotCold} Block. \emph{Code is available at \textcolor{magenta}{https://github.com/weihui1308/HOTCOLDBlock}}.
Although Deep Neural Networks (DNNs) have achieved impressive results in computer vision, their exposed vulnerability to adversarial attacks remains a serious concern. A series of works has shown that adding elaborate perturbations to images can cause catastrophic degradation in the performance metrics of DNNs. This phenomenon exists not only in the digital space but also in the physical space. Therefore, estimating the security of these DNN-based systems is critical for safely deploying them in the real world, especially for security-critical applications, e.g., autonomous cars, video surveillance, and medical diagnosis. In this paper, we focus on physical adversarial attacks and provide a comprehensive survey of over 150 existing papers. We first clarify the concept of the physical adversarial attack and analyze its characteristics. Then, we define the adversarial medium, which is essential for performing attacks in the physical world. Next, we present the physical adversarial attack methods in task order: classification, detection, and re-identification, and introduce their performance in solving the trilemma: effectiveness, stealthiness, and robustness. Finally, we discuss the current challenges and potential future directions.
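Current technologies in quantum-based communications bring a new integration of quantum data with classical data for hybrid processing. However, the frameworks of these technologies are restricted to a single classical or quantum task, which limits their flexibility in near-term applications. We propose a quantum reservoir processor to harness quantum dynamics in computational tasks that require both classical and quantum inputs. This analog processor comprises a network of quantum dots in which quantum data is incident on the network and classical data is encoded via a coherent field exciting the network. We perform a multitasking application of quantum tomography and nonlinear equalization of classical channels. Interestingly, the tomography can be performed in a closed-loop manner via feedback control of the classical data. Therefore, if the classical input comes from a dynamical system, embedding this system in a closed loop enables hybrid processing even if access to the external classical input is interrupted. Finally, we demonstrate the preparation of quantum depolarizing channels as a novel quantum machine learning technique for quantum data processing.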
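This paper presents a neural architecture search (NAS) method for split computing. Split computing is an emerging machine-learning inference technique that addresses the privacy and latency challenges of deploying deep learning in IoT systems. In split computing, a neural network model is split and processed cooperatively by an edge server and IoT devices over a network. The architecture of the neural network model therefore significantly affects the communication payload size, the model accuracy, and the computational load. In this paper, we address the challenge of optimizing the neural network architecture for split computing. To this end, we propose NASC, which jointly explores the optimal model architecture and a split point to meet a latency requirement (i.e., the total latency of computation and communication is smaller than a certain threshold). NASC employs one-shot NAS, which does not require repeated model training, for computationally efficient architecture search. Our performance evaluation using benchmark data from HW-NAS-Bench demonstrates that the proposed NASC can improve the trade-off between communication latency and model accuracy, i.e., it reduces the latency by approximately 40-60% from the baseline with only slight accuracy degradation.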
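The latency requirement mentioned above can be made concrete with a small sketch. This is not NASC itself: it fixes the architecture and only enumerates split points, and the function names, the per-layer timings, the payload sizes, and the bandwidth are all hypothetical values chosen for illustration. The assumed model is that layers before the split run on the device, the tensor at the split is transmitted, and the remaining layers run on the server.

```python
def total_latency(split, device_ms, server_ms, payload_kb, bandwidth_kb_per_ms):
    """Total latency when layers [0, split) run on the device and
    layers [split, n) run on the server (assumed latency model)."""
    compute = sum(device_ms[:split]) + sum(server_ms[split:])
    # payload_kb[i] is the size of the tensor entering layer i;
    # payload_kb[n] is the size of the final output.
    communication = payload_kb[split] / bandwidth_kb_per_ms
    return compute + communication


def feasible_splits(device_ms, server_ms, payload_kb,
                    bandwidth_kb_per_ms, threshold_ms):
    """Return every split point whose total latency meets the threshold."""
    n = len(device_ms)
    return [
        s for s in range(n + 1)
        if total_latency(s, device_ms, server_ms, payload_kb,
                         bandwidth_kb_per_ms) <= threshold_ms
    ]


# Hypothetical three-layer model: the device is slower than the server,
# and intermediate tensors shrink with depth.
DEVICE_MS = [4.0, 6.0, 8.0]
SERVER_MS = [1.0, 1.5, 2.0]
PAYLOAD_KB = [600.0, 200.0, 50.0, 1.0]
BANDWIDTH_KB_PER_MS = 10.0

if __name__ == "__main__":
    for s in range(len(DEVICE_MS) + 1):
        print(s, total_latency(s, DEVICE_MS, SERVER_MS,
                               PAYLOAD_KB, BANDWIDTH_KB_PER_MS))
    print(feasible_splits(DEVICE_MS, SERVER_MS, PAYLOAD_KB,
                          BANDWIDTH_KB_PER_MS, threshold_ms=30.0))
```

With these made-up numbers, sending the raw input to the server (split 0) is dominated by communication, while later splits trade device compute for a smaller payload. A joint search would additionally vary the architecture, which changes all three lists.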
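Existing learning-based image inpainting methods still struggle when facing complex semantic environments and diverse hole patterns. The prior information learned from large-scale training data is still insufficient for these situations. Reference images that capture the same scene share similar texture and structure priors with the corrupted image, which offers new prospects for the image inpainting task. Inspired by this, we first build a benchmark dataset containing 10K pairs of input and reference images for reference-guided inpainting. We then adopt an encoder-decoder structure to infer the texture and structure features of the input image separately, taking into account the discrepancy between texture and structure patterns during inpainting. A feature alignment module is further designed to refine these features of the input image under the guidance of the reference image. Both quantitative and qualitative evaluations demonstrate the superiority of our method in completing complex holes.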
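Neural architecture search (NAS) aims to automate the architecture design process and improve the performance of deep neural networks. Platform-aware NAS methods consider both performance and complexity and can find well-performing architectures that require few computational resources. Although ordinary NAS methods incur enormous computational costs because model training is repeated, one-shot NAS, which trains the weights of a supernetwork containing all candidate architectures only once during the search, has been reported to yield a lower search cost. This study focuses on architecture-complexity-aware one-shot NAS, which optimizes an objective function composed of the weighted sum of two metrics, such as predictive performance and the number of parameters. In existing methods, the architecture search process must be run multiple times with different coefficients of the weighted sum to obtain multiple architectures with different complexities. This study aims to reduce the search cost associated with finding multiple architectures. The proposed method uses multiple distributions to generate architectures with different complexities and updates each distribution, based on importance sampling, using the samples obtained from the multiple distributions. It thus allows us to obtain multiple architectures with different complexities in a single architecture search, which reduces the search cost. The proposed method is applied to the architecture search of convolutional neural networks on the CIFAR-10 and ImageNet datasets. Compared with baseline methods, the proposed method finds multiple architectures with varying complexities while requiring less computational effort.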